Goto

Collaborating Authors

 online prediction



Online Prediction of Switching Graph Labelings with Cluster Specialists

Neural Information Processing Systems

We address the problem of predicting the labeling of a graph in an online setting when the labeling is changing over time. We present an algorithm based on a specialist approach; we develop the machinery of cluster specialists which probabilistically exploits the cluster structure in the graph. Our algorithm has two variants, one of which surprisingly only requires O(log n) time on any trial t on an n-vertex graph, an exponential speed up over existing methods. We prove switching mistake-bound guarantees for both variants of our algorithm. Furthermore these mistake bounds smoothly vary with the magnitude of the change between successive labelings. We perform experiments on Chicago Divvy Bicycle Sharing data and show that our algorithms significantly outperform an existing algorithm (a kernelized Perceptron) as well as several natural benchmarks.


Adaptive Selective Sampling for Online Prediction with Experts

Neural Information Processing Systems

We consider online prediction of a binary sequence with expert advice. For this setting, we devise label-efficient forecasting algorithms, which use a selective sampling scheme that enables collecting much fewer labels than standard procedures. For the general case without a perfect expert, we prove best-of-both-worlds guarantees, demonstrating that the proposed forecasting algorithm always queries sufficiently many labels in the worst case to obtain optimal regret guarantees, while simultaneously querying much fewer labels in more benign settings. Specifically, for a scenario where one expert is strictly better than the others in expectation, we show that the label complexity of the label-efficient forecaster is roughly upper-bounded by the square root of the number of rounds. Finally, we present numerical experiments empirically showing that the normalized regret of the label-efficient forecaster can asymptotically match known minimax rates for pool-based active learning, suggesting it can optimally adapt to benign settings.


Online Prediction with Selfish Experts

Neural Information Processing Systems

We consider the problem of binary prediction with expert advice in settings where experts have agency and seek to maximize their credibility. This paper makes three main contributions. First, it defines a model to reason formally about settings with selfish experts, and demonstrates that ``incentive compatible'' (IC) algorithms are closely related to the design of proper scoring rules.



Adaptive Selective Sampling for Online Prediction with Experts

Neural Information Processing Systems

We consider online prediction of a binary sequence with expert advice. For this setting, we devise label-efficient forecasting algorithms, which use a selective sampling scheme that enables collecting much fewer labels than standard procedures. For the general case without a perfect expert, we prove best-of-both-worlds guarantees, demonstrating that the proposed forecasting algorithm always queries sufficiently many labels in the worst case to obtain optimal regret guarantees, while simultaneously querying much fewer labels in more benign settings. Specifically, for a scenario where one expert is strictly better than the others in expectation, we show that the label complexity of the label-efficient forecaster is roughly upper-bounded by the square root of the number of rounds. Finally, we present numerical experiments empirically showing that the normalized regret of the label-efficient forecaster can asymptotically match known minimax rates for pool-based active learning, suggesting it can optimally adapt to benign settings.


Model-free Online Learning for the Kalman Filter: Forgetting Factor and Logarithmic Regret

Qian, Jiachen, Zheng, Yang

arXiv.org Artificial Intelligence

We consider the problem of online prediction for an unknown, non-explosive linear stochastic system. With a known system model, the optimal predictor is the celebrated Kalman filter. In the case of unknown systems, existing approaches based on recursive least squares and its variants may suffer from degraded performance due to the highly imbalanced nature of the regression model. This imbalance can easily lead to overfitting and thus degrade prediction accuracy. We tackle this problem by injecting an inductive bias into the regression model via {exponential forgetting}. While exponential forgetting is a common wisdom in online learning, it is typically used for re-weighting data. In contrast, our approach focuses on balancing the regression model. This achieves a better trade-off between {regression} and {regularization errors}, and simultaneously reduces the {accumulation error}. With new proof techniques, we also provide a sharper logarithmic regret bound of $O(\log^3 N)$, where $N$ is the number of observations.


Personalized Federated Learning with Mixture of Models for Adaptive Prediction and Model Fine-Tuning

Ghari, Pouya M., Shen, Yanning

arXiv.org Artificial Intelligence

Federated learning is renowned for its efficacy in distributed model training, ensuring that users, called clients, retain data privacy by not disclosing their data to the central server that orchestrates collaborations. Most previous work on federated learning assumes that clients possess static batches of training data. However, clients may also need to make real-time predictions on streaming data in non-stationary environments. In such dynamic environments, employing pre-trained models may be inefficient, as they struggle to adapt to the constantly evolving data streams. To address this challenge, clients can fine-tune models online, leveraging their observed data to enhance performance. Despite the potential benefits of client participation in federated online model fine-tuning, existing analyses have not conclusively demonstrated its superiority over local model fine-tuning. To bridge this gap, the present paper develops a novel personalized federated learning algorithm, wherein each client constructs a personalized model by combining a locally fine-tuned model with multiple federated models learned by the server over time. Theoretical analysis and experiments on real datasets corroborate the effectiveness of this approach for real-time predictions and federated model fine-tuning.


Adaptive Selective Sampling for Online Prediction with Experts

Neural Information Processing Systems

We consider online prediction of a binary sequence with expert advice. For this setting, we devise label-efficient forecasting algorithms, which use a selective sampling scheme that enables collecting much fewer labels than standard procedures. For the general case without a perfect expert, we prove best-of-both-worlds guarantees, demonstrating that the proposed forecasting algorithm always queries sufficiently many labels in the worst case to obtain optimal regret guarantees, while simultaneously querying much fewer labels in more benign settings. Specifically, for a scenario where one expert is strictly better than the others in expectation, we show that the label complexity of the label-efficient forecaster is roughly upper-bounded by the square root of the number of rounds. Finally, we present numerical experiments empirically showing that the normalized regret of the label-efficient forecaster can asymptotically match known minimax rates for pool-based active learning, suggesting it can optimally adapt to benign settings.


Reviews: Online Prediction with Selfish Experts

Neural Information Processing Systems

In this paper, the standard (binary outcome) "prediction with expert advice" setting is altered by treating the experts themselves as players in the game. Specifically, restricting to the class of weight-update algorithms, the experts seek to maximize the weight assigned to them by the algorithm, as a notion of "rating" for their predictions. It is shown that learning algorithms are IC, meaning that experts reveal their true beliefs to the algorithm, if and only if the weight-update function is a strictly proper scoring rule. Because the weighted-majority algorithm (WM) has a weight update which is basically the negative of the loss function, WN is thus IC if and only if the (negative) loss is strictly proper. As the (negative) absolute loss function is not strictly proper, WM is not IC for absolute loss.